Goto

Collaborating Authors

 Woodlands County


On the equivalence of molecular graph convolution and molecular wave function with poor basis set

Neural Information Processing Systems

In this study, we demonstrate that the linear combination of atomic orbitals (LCAO), an approximation introduced by Pauling and Lennard-Jones in the 1920s, corresponds to graph convolutional networks (GCNs) for molecules. However, GCNs involve unnecessary nonlinearity and deep architecture. We also verify that molecular GCNs are based on a poor basis function set compared with the standard one used in theoretical calculations or quantum chemical simulations. From these observations, we describe the quantum deep field (QDF), a machine learning (ML) model based on an underlying quantum physics, in particular the density functional theory (DFT). We believe that the QDF model can be easily understood because it can be regarded as a single linear layer GCN. Moreover, it uses two vanilla feedforward neural networks to learn an energy functional and a Hohenberg--Kohn map that have nonlinearities inherent in quantum physics and the DFT. For molecular energy prediction tasks, we demonstrated the viability of an ``extrapolation,'' in which we trained a QDF model with small molecules, tested it with large molecules, and achieved high extrapolation performance. We believe that we should move away from the competition of interpolation accuracy within benchmark datasets and evaluate ML models based on physics using an extrapolation setting; this will lead to reliable and practical applications, such as fast, large-scale molecular screening for discovering effective materials.


Magnetic activity of ultracool dwarfs in the LAMOST DR11

Xiang, Yue, Gu, Shenghong, Cao, Dongtao

arXiv.org Artificial Intelligence

Ultracool dwarfs consist of lowest-mass stars and brown dwarfs. Their interior is fully convective, different from that of the partly-convective Sun-like stars. Magnetic field generation process beneath the surface of ultracool dwarfs is still poorly understood and controversial. To increase samples of active ultracool dwarfs significantly, we have identified 962 ultracool dwarfs in the latest LAMOST data release, DR11. We also simulate the Chinese Space Station Survey Telescope (CSST) low-resolution slitless spectra by degrading the LAMOST spectra. A semi-supervised machine learning approach with an autoencoder model is built to identify ultracool dwarfs with the simulated CSST spectra, which demonstrates the capability of the CSST all-sky slitless spectroscopic survey on the detection of ultracool dwarfs. Magnetic activity of the ultracool dwarfs is investigated by using the H$α$ line emission as a proxy. The rotational periods of 82 ultracool dwarfs are derived based on the Kepler/K2 light curves. We also derive the activity-rotation relation of the ultracool dwarfs, which is saturated around a Rossby number of 0.12.


Is Image-based Object Pose Estimation Ready to Support Grasping?

Joyce, Eric C., Zhao, Qianwen, Burgdorfer, Nathaniel, Wang, Long, Mordohai, Philippos

arXiv.org Artificial Intelligence

We present a framework for evaluating 6-DoF instance-level object pose estimators, focusing on those that require a single RGB (not RGB-D) image as input. Besides gaining intuition about how accurate these estimators are, we are interested in the degree to which they can serve as the sole perception mechanism for robotic grasping. To assess this, we perform grasping trials in a physics-based simulator, using image-based pose estimates to guide a parallel gripper and an underactuated robotic hand in picking up 3D models of objects. Our experiments on a subset of the BOP (Benchmark for 6D Object Pose Estimation) dataset compare five open-source object pose estimators and provide insights that were missing from the literature.


Few-shot Protein Fitness Prediction via In-context Learning and Test-time Training

Teufel, Felix, Kollasch, Aaron W., Huang, Yining, Winther, Ole, Yang, Kevin K., Notin, Pascal, Marks, Debora S.

arXiv.org Artificial Intelligence

Accurately predicting protein fitness with minimal experimental data is a persistent challenge in protein engineering. We introduce PRIMO (PRotein In-context Mutation Oracle), a transformer-based framework that leverages in-context learning and test-time training to adapt rapidly to new proteins and assays without large task-specific datasets. By encoding sequence information, auxiliary zero-shot predictions, and sparse experimental labels from many assays as a unified token set in a pre-training masked-language modeling paradigm, PRIMO learns to prioritize promising variants through a preference-based loss function. Across diverse protein families and properties-including both substitution and indel mutations-PRIMO outperforms zero-shot and fully supervised baselines. This work underscores the power of combining large-scale pre-training with efficient test-time adaptation to tackle challenging protein design tasks where data collection is expensive and label availability is limited.


Machine Learning Time Propagators for Time-Dependent Density Functional Theory Simulations

Shah, Karan, Cangi, Attila

arXiv.org Artificial Intelligence

Time-dependent density functional theory (TDDFT) is a widely used method to investigate electron dynamics under external time-dependent perturbations such as laser fields. In this work, we present a machine learning approach to accelerate electron dynamics simulations based on real time TDDFT using autoregressive neural operators as time-propagators for the electron density. By leveraging physics-informed constraints and featurization, and high-resolution training data, our model achieves superior accuracy and computational speed compared to traditional numerical solvers. We demonstrate the effectiveness of our model on a class of one-dimensional diatomic molecules under the influence of a range of laser parameters. This method has potential in enabling on-the-fly modeling of laser-irradiated molecules and materials by utilizing fast machine learning predictions in a large space of varying experimental parameters of the laser.


Implicit Neural Field-Based Process Planning for Multi-Axis Manufacturing: Direct Control over Collision Avoidance and Toolpath Geometry

Dutta, Neelotpal, Zhang, Tianyu, Liu, Tao, Chen, Yongxue, Wang, Charlie C. L.

arXiv.org Artificial Intelligence

Existing curved-layer-based process planning methods for multi-axis manufacturing address collisions only indirectly and generate toolpaths in a post-processing step, leaving toolpath geometry uncontrolled during optimization. We present an implicit neural field-based framework for multi-axis process planning that overcomes these limitations by embedding both layer generation and toolpath design within a single differentiable pipeline. Using sinusoidally activated neural networks to represent layers and toolpaths as implicit fields, our method enables direct evaluation of field values and derivatives at any spatial point, thereby allowing explicit collision avoidance and joint optimization of manufacturing layers and toolpaths. We further investigate how network hyperparameters and objective definitions influence singularity behavior and topology transitions, offering built-in mechanisms for regularization and stability control. The proposed approach is demonstrated on examples in both additive and subtractive manufacturing, validating its generality and effectiveness.


Best Practices for Biorisk Evaluations on Open-Weight Bio-Foundation Models

Wei, Boyi, Che, Zora, Li, Nathaniel, Sehwag, Udari Madhushani, Götting, Jasper, Nedungadi, Samira, Michael, Julian, Yue, Summer, Hendrycks, Dan, Henderson, Peter, Wang, Zifan, Donoughe, Seth, Mazeika, Mantas

arXiv.org Artificial Intelligence

Open-weight bio-foundation models present a dual-use dilemma. While holding great promise for accelerating scientific research and drug development, they could also enable bad actors to develop more deadly bioweapons. To mitigate the risk posed by these models, current approaches focus on filtering biohazardous data during pre-training. However, the effectiveness of such an approach remains unclear, particularly against determined actors who might fine-tune these models for malicious use. To address this gap, we propose BioRiskEval, a framework to evaluate the robustness of procedures that are intended to reduce the dual-use capabilities of bio-foundation models. BioRiskEval assesses models' virus understanding through three lenses, including sequence modeling, mutational effects prediction, and virulence prediction. Our results show that current filtering practices may not be particularly effective: Excluded knowledge can be rapidly recovered in some cases via fine-tuning, and exhibits broader generalizability in sequence modeling. Furthermore, dual-use signals may already reside in the pretrained representations, and can be elicited via simple linear probing. These findings highlight the challenges of data filtering as a standalone procedure, underscoring the need for further research into robust safety and security strategies for open-weight bio-foundation models.